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The present experiment examined whether a response class was acquired by humans with delayed 
reinforcement. Eight white circles were presented on a computer touch screen. If the undergraduates 
touched two of the eight circles in a specified sequence (i.e., touching first the upper-left circle then the 
bottom-left circle), then the touches initiated an unsignaled resetting delay culminating in point 
delivery. Participants experienced one of three different delays (Os, 10 s, or 30 s). Rates of the target 
two-response sequence were higher with delayed reinforcement than with no reinforcement. Terminal 
rates of the target sequence decreased and postreinforcement pauses increased as a function of delay 
duration. Other undergraduates exposed to yoked schedules of response-independent point deliveries 
failed to acquire the sequence. The results demonstrate that a response class was acquired with delayed 
reinforcement, extending the generality of this phenomenon found with nonhuman animals to 
humans. 
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Lattal and Gleeson (1990) demonstrated 
that, in the absence of shaping, bar pressing or 
key pecking could be acquired in free-operant 
sessions by food-deprived rats and pigeons 
when the response produced food after 
unsignaled resetting or nonresetting delays of 
up to 30 s. Since this seminal work appeared, 
the finding of response acquisition with 
delayed reinforcement has been replicated by 
several investigators (e.g., Anderson & Elcoro, 
2007; Dickinson, Watt, & Griffiths, 1992; 
Galuska & Woods, 2005; Schlinger & Blakely, 
1994; van Haaren, 1992; Wilkenfield, Nickel, 
Blakely, & Poling, 1992), with different re- 
sponse topographies (Critchfield & Lattal, 
1993), different reinforcers (Galuska & 
Woods, 2005; Lattal & Metzger, 1994; LeSage, 
Byrne, & Poling, 1996; Snycerski, Laraway, & 
Poling, 2005), and different species (Galuska 
& Woods, 2005; Lattal & Metzger, 1994) . 

It is surprising that there is no published 
work demonstrating response acquisition with 
delayed reinforcement by humans because the 
generality of this effect extends from Siamese 
fighting fish (Lattal & Metzger, 1994) to rhesus 

The author thanks Andy Lattal for his very helpful 
comments and suggestions on this manuscript. Some of 
these data were reported at the 33 rd Annual Convention of 
the Association for Behavior Analysis International, San 
Diego, May, 2007. 

Correspondence may be addressed to Hiroto Okouchi, 
Department of Psychology, Osaka Kyoiku University, 4-698- 
1 Asahigaoka, Kashiwara, Osaka 582-8582, Japan (e-mail: 
okouchi@cc.osaka-kyoiku.ac.jp). 

doi: 10. 1901/jeab. 2009. 91-377 


monkeys (Galuska & Woods, 2005) . Based on 
casual observation of daily behavior, one may 
suppose that humans obviously can acquire a 
new response with delayed reinforcement. It 
seems self-evident, and, therefore, may have 
been ignored by investigators. Some findings 
from the experimental analysis of human 
behavior, however, suggest that this issue 
needs careful consideration. 

First, human behavior is often said not to 
show the same sensitivity to schedules of 
reinforcement such as fixed-interval (FI), 
multiple fixed-ratio (FR) differential-reinforce- 
ment-of-low-rate (DRL), or concurrent vari- 
able-interval (VI) VI, as the behavior of 
nonhuman animals (e.g., Baron, Kaufman, & 
Stauber, 1969; Hayes, Brownstein, Zettle, 
Rosenfarb, & Korn, 1986; Horne & Lowe, 
1993) . Such human-nonhuman differences 
are still a matter of controversy. Some investi- 
gators have argued that important procedural 
variations between human and nonhuman 
research may account for many of the behav- 
ioral differences (e.g., Perone, Galizio, & 
Baron, 1988; Weiner, 1983), while others have 
suggested that there are fundamental differ- 
ences in the principles that govern human and 
nonhuman behavior (e.g., Lowe, 1979; Wear- 
den, 1988). Contingencies arranged in exper- 
iments of response acquisition with delayed 
reinforcement, such as tandem FR 1 differ- 
ential-reinforcement-of-other-behavior (DRO) 
schedules (e.g., Lattal & Gleeson, 1990), are, 
of course, types of schedules of reinforcement. 
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Thus, it is unknown whether schedules of 
delayed reinforcement have the same effects 
on human and nonhuman behavior. 

Second, verbal stimuli may influence the 
cases of response acquisition by humans with 
delayed reinforcement observed casually in 
daily life. Considerable research has shown 
formidable effects of verbal stimuli on human 
operant behavior (see reviews by Baron & 
Galizio, 1983; Kerr & Keenan, 1997; Vaughan, 
1989). Instructing which response produces 
reinforcers can establish the response without 
shaping (e.g., Kaufman, Baron, & Kopp, 1966; 
Weiner, 1962) and describing a relation 
between responses and reinforcers can main- 
tain the responses even though those respons- 
es never produce a reinforcer (e.g., Kaufman 
et al., 1966). Thus, it is plausible that humans 
can emit responses with delayed reinforce- 
ment when (a) the responses produce rein- 
forcers or (b) the response-reinforcer relation 
was instructed. Previous findings provide no 
prediction whether humans can respond with 
delayed reinforcement without such instruc- 
tions. 

Finally, the generally employed class of 
response in the experimental analysis of 
human behavior may have made the demon- 
stration of response acquisition difficult. Usu- 
ally, one response to an operandum, such as a 
press to a key, has been defined as an operant 
(e.g., Weiner, 1962). The operant response 
has been established by instructions specifying 
that the response produces reinforcers in 
many studies (e.g., Catania, Matthews, & 
Sliimoff, 1982; Okouchi, 2003; Torgrud & 
Holborn, 1990; Weiner, 1962) or by shaping 
in a few studies (Lowe, Beasty, & Bentall, 1983; 
Matthews, Shimoff, Catania, & Sagvolden, 
1977). Because previous experiments have 
demonstrated that nonhuman animals ac- 
quired responses in the absence of shaping 
(e.g., Lattal & Gleeson, 1990), specific instruc- 
tions or shaping should be omitted for a 
systematic replication using human subjects. 
Even though neither instructions nor shaping 
are given, some humans still may respond to 
the operandum. In Experiment 3 of Kaufman 
et al. (1966), for example, four college 
students pressed a key 7 to 161 times per 
minute with no reinforcer deliveries during 
the first 30 min of the experiment, although 
neither shaping nor instructions about the 
operandum were given. On the other hand, 


the operant level of pressing a single bar for 
one college student was zero without instruc- 
tions or shaping (DeGrandpre, Buskist, & 
Cush, 1990; also see Ader & Tatum, 1961; 
Ayllon & Azrin, 1964) . Response acquisition is 
the transition from operant-level responding 
to steady-state responding (Sidman, 1960, 
pp. 117-119; Snycerski, Laraway, Huitema, & 
Poling, 2004) . Thus, the operant level of the 
commonly-used class of response (a response 
to an operandum) often may be too high to 
detect reliable evidence of acquisition. This 
would be the case especially when the resetting 
delay is as long as 30 s, because this contin- 
gency restricts the rates of responding to 
around two responses per min to produce 
the maximum rates of reinforcement. By 
contrast, any response with an operant level 
at zero can not be acquired without shaping. 
Thus, an examination of the generality of 
response acquisition with delayed reinforce- 
ment using human participants requires a class 
of response with low but nonzero baseline 
rates (Critchfield & Lattal, 1993; Sidman, 
1960, pp. 117-119), and it is unlikely that 
single responses to an operandum would meet 
this requirement. 

The present experiment examined response 
acquisition by humans with delayed reinforce- 
ment. For this purpose, a two-response se- 
quence was selected as the class of response 
because its operant level should be low but 
above zero. Eight filled white circles were 
presented on a black touch screen of a 
monitor. Touches on two of the eight circles 
in a specified sequence initiated an unsignaled 
0-s, 10-s, or 30-s resetting delay, depending on 
the condition, culminating in point delivery 
exchangeable for money. Rates of the two- 
response sequences for participants exposed 
to these delayed reinforcement conditions 
were compared with (a) those when no 
reinforcer was delivered and (b) those for 
yoked-control participants to whom points 
were delivered independently from their per- 
formance. 

METHOD 

Participants 

Nine male and 21 female undergraduates 
recruited from an educational psychology class 
at Osaka Kyoiku University served as partici- 
pants. They were 19 to 29 years old, and none 
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Fig. 1. A depiction of the stimuli displayed to the 
subject on the computer screen. A two-response se- 
quence — a touch to the bottom-left circle following a 
touch to the upper-left circle — was the response class 
(target sequence) in the present experiment. 

had experience with operant conditioning 
experiments. 

Apparatus 

The experimental room was 1.70 m wide, 
2.20 m deep, and 2.17 m high. A Nihon 
Electric Company PC-982 1AP microcomputer, 
located in an adjacent room, controlled the 
experiment. The participant sat at a desk 
facing a color display monitor (250 mm wide 
by 180 mm high) equipped with a Micro 
Touch Systems touch screen. During the 
experimental session, three, two, and three 
filled white circles (30-mm diameter for each) 
were presented on the top (45 mm below the 
top of the screen), middle (90 mm below the 
top of the screen), and bottom (135 mm 
below the top of the screen) of the black 
screen, respectively. Figure 1 shows the con- 
figuration of these eight circles. At the top and 
bottom of the screen, three circles were 
presented at the left, center, and right, 
respectively, 45 mm apart from center to 
center, whereas two circles were presented at 
the left and right in the middle of the screen, 
respectively, 90 mm apart from center to 
center. When the schedule requirement was 
met, these eight circles disappeared and one 
filled white circle (30 mm diameter) for the 
“consunnnatory response” was presented in 
the middle center of the screen (not shown in 
Figure 1). All interevent times were recorded 
in real time with a 50-ms resolution. Each 


touch to the eight circles was accompanied by 
a low tone (880 Hz), and the appearance of 
the middle-center circle (delivery of a rein- 
forcer) and each touch to that circle (consum- 
ing the reinforcer) were accompanied by a 
high tone (1760 Hz). A digital counter was 
located at the top right of the screen. 

Class of Response 

The operant level of the response class in 
the present experiment needed to be low. To 
determine that class, operant levels of touch- 
ing the eight circles were measured for 5 
undergraduates other than the ones partici- 
pating in the present experiment. The proce- 
dure in this pilot experiment was identical to 
that of the nonreinforcement condition de- 
scribed below. Thus, these pilot participants 
were exposed to the screen showing the eight 
circles (Figure 1), were told that they might 
earn points by touching the circles, but 
actually no points were delivered. Under this 
condition, every touch to each circle was 
recorded for 10 min. One of the most infre- 
quent two-response sequences, touching the 
upper-left circle then the bottom-left one, was 
selected as the present response class. During 
the 10 min, this sequence of responses did not 
occur for 4 of the 5 pilot participants and 
occurred only once for the other participant. 

Procedure 

Participants signed an informed consent 
agreement that specified the frequency and 
duration of their participation and the average 
earnings for such participation. They agreed 
to remain in the experiment for a maximum 
of eight 90-min experimental periods. At the 
beginning and at the end of the experiment, 
each participant was asked not to speak to 
anyone other than the experimenter about the 
study in an attempt to prevent discussion 
about the contingencies among participants 
(cf. Horne & Lowe, 1993). At the end of the 
experiment, each participant was asked wheth- 
er he or she had any information to offer 
about the study. All reported that they did not. 

A 90-min experimental period was conducted 
once per day, two times per week. During this 
period, a maximum of seven variable-duration 
sessions occurred. Sessions were separated by 2- 
to 3-min breaks. After every experimental 
period, participants were paid for their perfor- 
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mance (2 yen per 100 points, approximately 
.018 U.S. dollars). Upon completion of the 
entire experiment, participants were paid for 
their participation (200 yen per 90 min) and 
were debriefed. The overall earnings for each 
participant who completed the entire experi- 
ment ranged from 2,320 to 3,370 yen (approx- 
imately 20.88 to 30.33 U.S. dollars). 

On the first day of the experiment, each 
participant was asked to leave wristwatches, 
cellular phones, and books outside the experi- 
mental room. Once in the room, the participant 
was asked to read silently the following instruc- 
tions (translated here from Japanese to English) : 

Your task is to earn as many points as you can. 

A hundred points are worth two yen. Payment 
for the points will be made at the end of each 
visit. In addition, you will be paid 200 yen for 
every day you spend in the experiment. 
Payment for participation will be made at the 
end of the last visit. 

Eight circles will be shown on the screen of the 
display monitor. Sometimes these circles may 
disappear, then a circle will appear in the center 
of the screen. By touching the center circle, you 
can earn points. Accumulated points will be 
shown in the top right of the screen. 

It is up to you whether you touch the eight 
circles or not. If you touch these circles, the 
center circle may appear. Or, the appearance 
of the center circle may be unrelated to 
touching the eight circles. 

The words “READY’ and “GO” will appear in 
sequence on the screen. When the word “GO” 
disappears, do the task until the words “GAME 
OVER” appear on the screen. 

The typed set of instructions remained on 
the desk throughout the experiment. Ques- 
tions regarding the experimental procedure 
were answered by telling the participant to 
reread the appropriate sections of the instruc- 
tions. Then the words “READY’ and “GO” 
were presented in sequence at the top left of 
the display monitor. After the word “GO” 
disappeared, eight circles were presented on 
the screen as shown in Figure 1. 

When the schedule requirement was met, 
the eight circles were darkened and the circle 
for the consummatory response was presented 
in the middle center of the screen. A touch 
darkened the circle and accumulated 100 
points on the top right counter, followed by 
the reappearance of the eight circles. After the 
session terminated, the words “GAME OVER” 
appeared at the top left of the screen. 


Participants were assigned to one of six 
conditions: one of three conditions of se- 
quence-dependent delayed reinforcement, or 
one of three yoked control conditions of 
sequence-independent reinforcement. All par- 
ticipants were exposed first to a nonreinforce- 
ment condition, then to a reinforcement 
condition. The procedure in the nonreinforce- 
ment condition was the same for all partici- 
pants and identical to that of the pilot 
experiment. In this condition, a 10-min single 
session was conducted to measure the operant 
level of the target response sequence (upper- 
left-then-bottom-left sequence) . 

The procedure of the reinforcement condi- 
tion differed across participants depending on 
the condition to which they were assigned. 
Participants in the conditions of sequence- 
dependent delayed reinforcement were ex- 
posed to a tandem FR1 DRO schedule. Under 
this schedule, a touch to the bottom left circle 
after a touch to the upper left circle initiated a 
delay of 0 s (the 0-s delay condition), 10 s (the 
10-s delay condition), or 30 s (the 30-s delay 
condition) to reinforcement, then the center 
circle appeared for the consummatory re- 
sponse (reinforced). Any two-response se- 
quence of upper-left-then-bottom-left that oc- 
curred during a delay restarted the delay 
interval. Responses other than this target 
sequence were without consequences. Each 
session lasted until 45 reinforcers occurred or 
60 min (including the time taken for consum- 
matory responses) elapsed, whichever came 
first. Each participant was exposed to this 
schedule for 20 sessions or until eight 90-min 
experimental periods were completed, which- 
ever came first. 

Participants in the yoked control conditions 
of sequence-independent reinforcement were 
exposed to identical contingencies to those of 
the sequence-dependent delayed reinforce- 
ment conditions with the following exceptions. 
The frequency and distribution of sequence- 
independent reinforcers, the session duration, 
and the number of sessions for each participant 
were yoked to those of one of the participants 
exposed to the tandem FR1 DRO schedules. 

Participant Screening 

Although the pilot data suggested that 
operant levels of the present response class 
would be low, it was still uncertain whether the 
level for each participant would be suitable for 
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answering the present research question. 
Thus, participants were initially screened 
according to their operant levels. 

First, because a relatively high level of 
responding makes valid evaluation of acquisi- 
tion difficult, participants emitting the target 
sequence more than four times during the 10- 
min nonreinforcement session were eliminat- 
ed from further study. Second, because re- 
sponses must occur to be reinforced (Wilken- 
field et al., 1992), participants never emitting 
the target sequence during the nonreinforce- 
ment and the first reinforcement sessions 
(70 min in total) were screened out. Finally, 
participants emitting the sequence less than a 
total of five times from the nonreinforcement 
session through the second reinforcement 
session (130 min in total) also were screened 
out because a sequence with such a low 
operant level might not be strengthened by 
reinforcement. For example, one pigeon in 
Experiment 1 of Lattal and Gleeson (1990) 
responded nine times during the first two 
sessions of a tandem FR1 fixed-time (FT) 30-s 
schedule but ceased responding for the next 
five sessions. The first criterion was applied to 
all participants, whereas the other two were 
applied only to participants in the sequence- 
dependent delayed reinforcement conditions. 

The first, second, and third criteria, respec- 
tively, resulted in the elimination of one (the 
30-s delay condition), three (one in the 0-s 
delay and two in the 10-s delay conditions) and 
two participants (the 30-s delay condition). 
Furthermore, two participants in the 30-s delay 
condition discontinued the experiment with 
no sign of response acquisition. One of them 
dropped out after exposure to two sessions of 
the tandem FR1 DRO 30-s schedule because of 
personal reasons (unrelated to the ongoing 
experiment) and another was dismissed dur- 
ing the fourth session of the delayed contin- 
gency because of a violation of the contract 
(bringing a book into the experimental room 
and reading it during the session). Thus, four, 
four, and three pairs of response-dependent 
delayed reinforcement participants and their 
yoked partners completed the 0-s, 10-s, and 30- 
s delay conditions, respectively. 

RESULTS 

Figures 2, 3, and 4, respectively, show the 
number of target sequences per minute for 


each participant exposed to the 0-s, 10-s, and 
30-s delays to reinforcement in each session. 
Data for the yoked participants are shown on 
the same panels as their partners in the 
delayed reinforcement condition. The rates 
of the sequence during the nonreinforcement 
session were zero or close to zero for all 
participants, suggesting that the operant levels 
of the target sequence were extremely low. 

When each target sequence produced a 
reinforcer immediately (the 0-s delay condi- 
tion), rates of the target sequence increased 
for all 4 participants (Figure 2). Although 
there was some variability in target sequence 
rate across sessions and participants, the rates 
for at least the last 19 sessions for all 
participants were higher than those during 
the nonreinforcement session and those of the 
yoked participants. 

Figure 3 shows the data from the 10-s delay 
condition. For all sessions for Participants E5 
and E6, and for the last 19 sessions for 
Participant E8, rates of the target sequences 
under the tandem FR1 DRO 10-s schedule 
were higher than those for the no reinforce- 
ment session and those for the yoked partic- 
ipants under sequence-independent reinforce- 
ment. Participant E7 experienced only 12 
sessions of the 10-s delay of reinforcement, 
because his erratic performance during the 
initial 10-s delay sessions consumed consider- 
able time. During the first, second, third, 
fourth, and fifth sessions for this participant, 
numbers of single responses to any of the eight 
circles, target sequences, and reinforcers, 
respectively, were 387, 5, and 5, 1077, 3, and 
3, 72, 0, and 0, 5154, 15, and 15, and 5526, 28, 
and 22. Each of these sessions lasted 60 min. 
For the last seven sessions, however, rates of 
the target sequences for Participant E7 were 
consistently higher than that for the nonrein- 
forcement session and those of his yoked 
partner, C7. 

Participants and their yoked partners re- 
ceived fewer than 20 reinforcement sessions 
under the tandem FR1 DRO 30-s schedule 
(Figure 4). Because a tandem FR1 DRO 30-s 
schedule required relatively longer times to 
obtain each reinforcer, none of the delayed- 
reinforcement participants completed 20 ses- 
sions within the contracted eight, 90-min, 
experimental periods. For all 15 and 14 
sessions for Participants E13 and E15, respec- 
tively, and for the last 11 sessions for Partici- 
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Fig. 2. Rates of the target sequence in each session for each participant exposed to the 0-s delay to reinforcement and 
for the yoked participant. Filled and open squares represent rates of the sequence in the no-reinforcement condition for 
participants subsequently exposed to the delayed reinforcement and for their yoked partners, respectively. Filled circles 
represent rates of the sequence in the delayed-reinforcement condition, whereas open circles represent rates of the 
sequence in the sequence-independent reinforcement condition. 


pant E14, rates of the target sequences under 
the tandem FR1 DRO 30-s schedule were 
higher than those under no reinforcement 
and those for the yoked participants under the 
sequence-independent reinforcement. 

Another measure of response acquisition is 
the ratio of the number of responses that 
produced reinforcers to the total number of 
responses. Some investigators have used two 
levers (with rats as subjects) and compared 
responses on a lever that produced food with 
those on a lever on which responses had no 
programmed consequences (e.g., Keely, Feola, 
& Fattal, 2007; Sutphin, Byrne, & Poling, 
1998). In the present experiment, the fre- 
quency of the two-response target sequence 
(touching first the upper-left circle, then the 
bottom-left circle) can be compared to the 
frequency of all other two-response sequences. 
These two-response sequences were counted 


only from the immediately preceding reinforc- 
er to the next reinforcer, but were not counted 
across a reinforcer. For example, events may 
occur in the following order: 

UL, BL, UL, SR + , BL, UL, 

BL, MR, UC, SR + , 

where UL, BF, MR, and UC are touches to the 
upper-left, bottom-left, middle-right, and up- 
per-center circles, respectively, and SR+ is a 
reinforcer. In this example, the two-response 
sequences are UF-BL, BF-UL, BF-UL, UF- 
BF, BF-MR, and MR-UC. The numbers of 
target sequences, nontarget sequences, and 
total sequences are two, four, and six, respec- 
tively, whereas the number of total responses is 
eight. 

Figures 5, 6, and 7 show target sequences as 
a percentage of total sequences for each 
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Fig. 3. Rates of the target sequence in each session for each participant exposed to the 10-s delay to reinforcement 
and for the yoked participant. Details are as in Figure 2. 


participant exposed to the 0-s, 10-s, and 30-s 
delays to reinforcement in each session and 
that for the yoked participant. With this 
measure, a value of 100 indicates that the 
target sequence occurred exclusively through- 
out the session, whereas a value of 0 indicates 
that nontarget sequences occurred exclusively. 
Missing data points indicate that there were no 
two-response sequences during those sessions. 
Table 1 shows the median of single responses 
(not sequences) to any of the eight circles per 
minute in the last five sessions of the delayed 
reinforcement or the sequence-independent 
yoked reinforcement conditions. 

Consistent with the previous analysis of 
target sequence rates (Figure 2), the percent- 
ages of target sequences for at least the last 19 
sessions of the 0-s delay to reinforcement were 
higher for all participants than those during 
the nonreinforcement session and those for 
the last 19 sessions for the yoked participants 
with sequence-independent reinforcement 


(Figure 5). The percentages during at least 
the final nine sessions with the 10-s delayed 
reinforcement were higher for all participants 
than those during the nonreinforcement 
session and those during respective sessions 
for the yoked participants (Figure 6). The 
percentages during the last 7 and 13 sessions 
with the 30-s delayed reinforcement for Par- 
ticipants E13 and E15, respectively, were 
higher than those during the nonreinforce- 
ment session and those during respective 
sessions for the yoked participants (Figure 7). 
Although the percentages for Participant E14 
were lower than those for the other two 
participants in the 30-s delayed reinforcement 
condition, close scrutiny revealed that these 
percentages showed an increasing trend with 
exposure to this contingency and those during 
the last nine sessions were higher than that 
during the nonreinforcement session. The 
missing data for the yoked participant’s 
(C14) respective sessions indicate that she 
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Fig. 4. Rates of the target sequence in each session for each participant exposed to the 30-s delay to reinforcement 
and for the yoked participant. Details are as in Figure 2. 


ceased all responding (see also Table 1), and 
this suggests that the target sequence devel- 
oped differentially for her partner E14 who 
was exposed to delayed reinforcement. 

Percentages of the target sequences were 
generally low for some participants exposed to 
delayed reinforcement (i.e., Participants E6, 
E7, and E14), although they were higher than 
those for the yoked partners (Figures 6 and 7). 
For these participants, rates of single responses 
to any of the eight circles were relatively high 
(Table 1). These results raise two questions. 
First, did the target sequence occur more 
frequently than any of the nontarget two- 
response sequences? Second, was the target 
sequence acquired as a single unit or was it 
embedded in a larger sequence of responses? 
Table 2 shows the two-response sequence that 
occurred most frequently and the rank of the 
frequency of the target sequence during the 
last five sessions of the 10-s and 30-s delays to 


reinforcement. The target sequence occurred 
most frequently for all participants except for 
Participants E7 and E14 for whom the fre- 
quency of the target sequence ranked quite 
low. It should be noted, however, that relative 
frequencies of the most frequent sequences 
for these participants (MR-BC for E7 and BL- 
UR for E14) were low also, and comprised only 
4.5% (E7) and 6.2% (E14) of all two-response 
sequences. 

Table 2 shows the sizes of total sequences 
that occurred from the immediately preceding 
reinforcer to the next reinforcer. Sequences 
which were repeated most frequently (domi- 
nant sequences) also are shown. Participants 
E5, E8, and E15 emitted exclusively the target 
sequence 222, 164, and 163 times out of 225 
opportunities, respectively, during the last five 
sessions, suggesting that the target sequence 
was acquired as a single unit. By contrast, 
certain long sequences that included the 
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Fig. 5. Percentages of the number of the target sequence to the number of all two-response sequences in each session 
for each participant exposed to the 0-s delay to reinforcement and for the yoked participant. Filled and open squares 
represent percentages of the target sequence in the no-reinforcement condition for participants subsequently exposed to 
the delayed reinforcement and for their yoked partners, respectively. Filled circles represent percentages of the target 
sequence in the delayed reinforcement condition, whereas open circles represent percentages of the target sequence in 
the sequence-independent reinforcement condition. Missing data indicate that percentages could not be calculated 
because no two-response sequences occurred. 


target sequence tended to be repeated for 
Participants E13 and E14. Participant E13 
emitted the sequence UL-BL-UR-BR 224 
times out of 225 opportunities, whereas 
Participant E14 emitted a 64-response se- 
quence, in which the target sequence was 
embedded in the 18 th and 19 th responses, 150 
times out of 225 opportunities. Long sequenc- 
es occurred for Participants E6 and E7 also, 
but none were repeated more than 12 times. 

It is also possible to compare the results of 
each response-dependent condition as a func- 
tion of the delay duration. The left panel of 
Figure 8 plots the medians of the target 
sequence rates across the last five sessions for 
participants in each condition of delayed 
reinforcement. The rates decreased as the 
delay interval increased. The right panel of 


Figure 8 plots the medians of the postrein- 
forcement pauses (PRPs; time between the 
termination of the consummatory response 
period and the first occurrence of the target 
sequence) across the last five sessions for 
participants in each condition of delayed 
reinforcement. The PRPs increased as the 
delay interval increased, again suggesting 
a delay-of-reinforcement-gradient consistent 
with the results of the sequence rates. 

DISCUSSION 

In general, response acquisition under 
conditions of delayed reinforcement has been 
assessed by comparing response rates with (a) 
those under no reinforcement (e.g., Lattal & 
Gleeson, 1990; Wilkenfield et al., 1992), (b) 
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Sessions 


Fig. 6. Percentages of the number of the target sequence to the number of all two-response sequences in each session 
for each participant exposed to the 10-s delay to reinforcement and for the yoked participant. Details are as in Figure 5. 


those under response-independent-yoked re- 
inforcement (e.g., Dickinson et al., 1992; 
Galuska & Woods, 2005; Lattal & Gleeson, 
1990), and (c) rates of different responses that 
produce no reinforcer (e.g., Critchfield & 
Lattal, 1993; Galuska & Woods, 2005; Keely 
et al., 2007; Wilkenfield et al., 1992). In terms 
of the present experiment, a response may be 
considered acquired by delayed reinforcement 
when the number of the target sequences per 
min and the number of target sequences as 
percentage of all two-response sequences are 
higher (a) when the target sequences were 
reinforced than when they were not rein- 
forced, and (b) for participants delivered 
reinforcers depending on their performance 
than for participants delivered reinforcers 
independently. By these criteria, all 7 partici- 
pants who were exposed to the sequence- 
dependent delayed reinforcement conditions 
acquired the two-response sequence. 


Previous experiments have shown response 
acquisition with a delay of 30 s in pigeons 
(Lattal & Gleeson, 1990) and rats (Critchfield 
& Lattal, 1993; Lattal & Gleeson, 1990; Lattal 
& Williams, 1997; Snycerski et al., 2005; but see 
also van Haaren, 1992). The present experi- 
ment replicates these results with 3 human 
participants. 

A number of experiments using nonhuman 
subjects have found that response rates show a 
delay-of-reinforcement gradient (e.g., Dickin- 
son et al., 1992; Lattal & Metzger, 1994; Reilly 
& Lattal, 2004), whereas, to our knowledge, 
only Reilly and Lattal have found gradients for 
PRPs. The delay-of-reinforcement gradients 
found for both response rates and PRPs in 
the present experiment provide additional 
evidence of the generality of this phenomenon 
across species, and they demonstrate the 
consistency of the present results with those 
of previous experiments. However, the delay- 
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Fig. 7. Percentages of the number of the target sequence to the number of all two-response sequences in each session 
for each participant exposed to the 30-s delay to reinforcement and for the yoked participant. Details are as in Figure 5. 


Table 1 

The median of single touches to any of the eight circles per minute (ranges in parentheses) for 
each participant exposed to delayed reinforcement (left) and for the yoked participant (right) in 
the last five sessions of each condition. 


Participant 

Delayed (E) 

Yoked (C) 

0-s Delay 



9 

130.1 (93.7-149.2) 

0.0 (0.0-0.0) 

10 

124.6 (109.8-130.1) 

0.0 (0.0-10.1) 

11 

151.6 (138.3-165.2) 

1.4 (0.0-5. 9) 

12 

149.3 (128.6-164.6) 

0.0 (0.0-0.0) 

10-s Delay 



5 

10.0 (9.7-10.1) 

34.1 (22.8-47.2) 

6 

82.6 (77.4-106.5) 

241.3 (49.2-280.6) 

7 

173.4 (169.9-184.7) 

17.0 (3.9-21.7) 

8 

33.3 (28.3-40.2) 

5.4 (5. 4-7.0) 

30-s Delay 



13 

7.7 (7.7— 7.7) 

35.4 (31.9-41.7) 

14 

80.7 (59.5-108.6) 

0.0 (0.0-0.0) 

15 

4.5 (4.3-5.0) 

0.0 (0.0-0.0) 
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Table 2 

Two-response sequences occurring most frequently, ranks of frequency of the target sequence, 
median sizes of sequence per reinforcer (ranges in parentheses), and dominant sequences. Data 
are pooled across the last five sessions of training for each participant exposed to 
delayed reinforcement. 



Two-response sequence 

Total 

sequence 

Participant 

Most frequent 
sequence 

Rank of target 
sequence 

Size of sequence 

Dominant sequence 3 

10-s delay 

E5 

UL-BL 

i 

2 (2-4) 

UL-BL 

E6 

UL-BL 

i 

23 (8-256) 

No dominant sequence 

E7 

MR-BC 

24 

90 (36-564) 

No dominant sequence 

E8 

UL-BL 

1 

2 (2-109) 

UL-BL 

30-s delay 

E13 

UL-BL 

1 

4 (4-8) 

UL-BL-UR-BR 

E14 

BL-UR 

19 

64 (35-129) 

64-response sequence 

E15 

UL-BL 

1 

2 (2-16) 

UL-BL 


Note. U, M, B, L, C, and R, denote upper, middle, bottom, left, center, and right, respectively, and indicate the location 
of circles touched. UL-BL, for example, is the target sequence of touching first the upper-left then the bottom-left 
circles, a. See text for details. 


of-reinforcement gradients obtained from the 
present experiment have a limitation: the 
resetting delay contingency can restrict rates 
of responding (Critchfield & Lattal, 1993; 
Sutphin et al., 1998). Under such a contin- 
gency, the most efficient performance is to 
respond once immediately after the prior 
reinforcement, then to pause until a reinforcer 
occurs. Thus, even response rates produced by 
efficient performance can decrease as the delay 
interval is increased. It is very possible, there- 
fore, that the resetting delay contingency 
contributed to the delay-of-reinforcement gra- 
dients found with response rates. Although the 
resetting delay contingency should not affect 
PRPs directly, it may have had an indirect 




Delay Value (s) Delay Value (s) 

Fig. 8. Medians of the rates of the target sequence 
(left) and the postreinforcement pauses (right) for the last 
five sessions in each condition of delayed reinforcement. 
Unfilled circles show data from individual participants and 
solid lines connect group medians. 


influence; that is, a general reduction in 
response rate produced by the resetting delay 
contingency could contribute to the PRP’s 
results. 

On time-based or interval schedules, organ- 
isms can emit behavior that is not required by 
the contingency (Pierce & Epling, 1995, 
p. 139). When lever pressing produced food 
according to a VI schedule, for example, rats 
also drank an extreme amount of water even 
though this was unrelated to contingencies of 
reinforcement (Falk, 1961). In the context of 
response acquisition with delayed reinforce- 
ment, rats pressed not only a lever which 
produced food with resetting delay but also 
another lever on which responses had no 
programmed consequences (e.g., Sutphin et 
al., 1998; Wilkenfield et al., 1992). Consistent 
with these previous findings, 4 of 7 participants 
exposed to the resetting delay contingency 
frequently emitted responses other than the 
target sequences. These results suggest that for 
some participants the target two-response se- 
quence was acquired not as a single unit but as a 
member of a longer sequence. These results, 
however, do not alter the conclusion that the 
target sequence was acquired. Whether the 
target sequence was a single unit or it was 
embedded in a longer sequence of responses, 
the target two-response sequences occurred at 
higher overall and relative rates with delayed 
reinforcement than with no reinforcement or 
with response-independent reinforcement. 
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The present results may suggest that re- 
sponse acquisition with delayed reinforcement 
is general across classes of responses. Pecking a 
key (Lattal & Gleeson, 1990), pressing a lever 
(e.g., Lattal & Gleeson, 1990; Wilkenfield et 
al., 1992), and breaking a photocell beam 
(Critchfield & Lattal, 1993; Lattal & Metzger, 
1994) have been acquired in previous exper- 
iments. The present experiment found that 
not only single responses but more complex 
classes of response (i.e., touching two circles in 
a particular sequence) also can be acquired 
with delayed reinforcement. However, a fea- 
ture of the present results, by contrast, limits 
the generality across response classes with 
humans as subjects. As described in the 
Introduction, the present experiment em- 
ployed a two-response sequence as a response 
class because humans often respond to a single 
operandum without reinforcement (Kaufman 
et al., 1966). Performance during the nonre- 
inforcement session of the present experiment 
highlights this problem; there were often high 
operant levels of single responses. The 22 
participants touched any of the eight circles 10 
to 1055 (median = 191) times during the 10- 
min nonreinforcement session. Thus, al- 
though the present experiment showed acqui- 
sition of a two-response sequence, these high 
rates of single responses suggest it would be 
more difficult to demonstrate acquisition of a 
simple response to a single operandum in 
humans. 
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